SED(1) USER DOCUMENTATION SED(1)
NAME
sed - the stream editor
SYNOPSIS
sed [-n] [-g] [-e script] [-f sfilename] [filename ]
DESCRIPTION
sed reads each filename line by line, edits each line according to a script
of commands as specified by the -e and -f arguments and then copies the
edited line to the standard output.
OPTIONS
The -e option supplies a single edit command from the next argument; if
there are several of these they are executed in the order in which they
appear. If there is just one -e option and no -f's, the -e flag may be
omitted. An -f option causes commands to be taken from the file sfilename;
if there are several of these they are executed in the order in which
they appear; -e and -f commands may be mixed. The script or sfilename can
be adjacent to the -e or -f or can be the next argument on the command
line. The -g option causes sed to act as though every substitute command
in the following script has a g suffix. The -n option suppresses the
default output.
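As a rough illustration of the options above, the following shell sketch
assumes a POSIX-style shell with printf and some sed on the PATH. The script
file name is made up for the example. The -g option is left out because other
versions of sed may not accept it.

```shell
# Two -e commands run in the order given: a -> b, then b -> c.
out1=$(printf 'a\n' | sed -e 's/a/b/' -e 's/b/c/')
echo "$out1"

# With a single command and no -f, the -e flag may be omitted.
out2=$(printf 'a\n' | sed 's/a/b/')
echo "$out2"

# -n suppresses the default output, so only the explicit p prints.
out3=$(printf 'one\ntwo\n' | sed -n '/two/p')
echo "$out3"

# -f takes commands from a file (the file name here is hypothetical).
script=${TMPDIR:-/tmp}/cmds.$$.sed
printf 's/one/1/\ns/two/2/\n' > "$script"
out4=$(printf 'one two\n' | sed -f "$script")
echo "$out4"
rm -f "$script"
```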
SCRIPTS
A script consists of one or more sed commands of the following form:
[address[,address]] function [arguments]
Normally sed cyclically copies a line of input into a current text buffer,
then applies in sequence all commands whose addresses select that line and
then copies the buffer to standard output and clears the buffer. The -n
option suppresses normal output so that only commands which do output (e.g.
p) cause any writing to occur. Also, some commands (n, N) do their own
line reads, and some others (o, d, D) cause all commands following in the
script to be skipped (the D command also suppresses the clearing of the
current text buffer that would normally occur before the next cycle).
There is also a second buffer (called the 'hold space') whose contents can be
copied or appended to or from the current text buffer, or swapped with it.
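The cycle described above can be observed with a few one-line scripts. This
is only a sketch, assuming a POSIX-style shell and a sed whose basic commands
behave as documented here; the variable names are invented for the example.

```shell
# Default cycle: p writes the buffer, then the end-of-cycle copy writes it again.
dup=$(printf 'x\ny\n' | sed 'p')
echo "$dup"

# With -n only the explicit p writes anything.
once=$(printf 'x\ny\n' | sed -n 'p')
echo "$once"

# n does its own line read, so 'n;p' under -n prints every second line.
even=$(printf '1\n2\n3\n4\n' | sed -n 'n;p')
echo "$even"

# d skips the rest of the script, so the s command never sees line 'a'.
skip=$(printf 'a\nb\n' | sed '/a/d;s/b/B/')
echo "$skip"

# Hold space: save a line, read the next, append the saved one (swaps pairs).
pair=$(printf '1\n2\n' | sed -n 'h;n;G;p')
echo "$pair"
```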
ADDRESSES
An address is: a decimal number (which matches that numbered line where
line numbers start at 1 and run cumulatively across files), or a '$' (which
matches the last line of input), or a '/regular expression/' (which matches
any line satisfying the expression). The following rules govern the address
matching:
* A command line with no addresses selects every input line.
* A command line with one address selects every input line that matches
that address.
* A command line with two addresses selects the inclusive range from the
first input line that matches the first address up to and including
the next input line that matches the second. (If the second address is a
number less than or equal to the line number first selected, only one
line is selected.) Once the second address is matched sed starts
h**2 Documentation 21 September 1991 1
looking for the first one again; thus, any number of these ranges will
be matched.
* The second address may be in the form of '+number'. This means that
the command will stay selected for number lines after the first
address is satisfied.
* \?regular expression? where ? is any character is identical to
/regular expression/.
* The negation operator '!' preceding a function makes that function
apply to every line not selected by the address(es).
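The address rules above can be tried out as follows. This is a sketch that
assumes a POSIX-style shell; the sample input and variable names are invented.
The '+number' form is not shown because not every sed accepts it.

```shell
input=$(printf 'alpha\nbegin\nbeta\nend\nomega')

# A decimal number selects that line; '$' selects the last line.
second=$(printf '%s\n' "$input" | sed -n '2p')
last=$(printf '%s\n' "$input" | sed -n '$p')

# Two addresses select an inclusive range.
range=$(printf '%s\n' "$input" | sed -n '/begin/,/end/p')

# A numeric second address not beyond the first selects just one line.
single=$(printf '%s\n' "$input" | sed -n '3,1p')

# '!' applies the function to every line NOT selected.
kept=$(printf '%s\n' "$input" | sed '2,4!d')

# \,RE, uses ',' as the delimiter, handy when the RE contains '/'.
usr=$(printf '/usr/bin\n/etc\n' | sed -n '\,^/usr,p')

printf '%s\n' "$second" "$last" "$range" "$single" "$kept" "$usr"
```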
FUNCTIONS
In the following list of functions, the maximum number of addresses
permitted for each function is indicated in parentheses. An argument
denoted 'text' consists of one or more lines, with all but the last ending
with '\' to hide the newline. A command with this type argument must be the
last on any command line or -e argument. Otherwise multiple commands may
appear on a line separated by ';' characters. A command may have a
trailing comment indicated by a '#' character. Comment lines begin with a
'#'. Backslashes in text are treated as described in 'escape sequences'
below; they may be used to protect initial whitespace against the
stripping that is done on every line of the script. An argument denoted
'label', 'rfile' or 'wfile' (which specify labels or file names) is not
processed for 'escape sequences'. Therefore a ';' or '#' terminates the
label or file name. This simplifies entering DOS style paths. Each 'wfile'
is created before processing begins. There can be at most 10 distinct
'wfile' arguments.
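A small sketch of the ';' separator and '#' comments described above. It
assumes a POSIX-style shell, and the script file name is hypothetical.
Commands with a 'text' argument (a, c, i) are left out because their
spelling differs between versions of sed, and support for trailing comments
may also vary.

```shell
# Several commands on one line, separated by ';'.
ends=$(printf '1\n2\n3\n' | sed -n '1p;$p')
echo "$ends"

# The same script in a file, with a comment line and a trailing comment.
script=${TMPDIR:-/tmp}/demo.$$.sed
cat > "$script" <<'EOF'
# A comment line: print only the first and last lines.
1p   # trailing comment after a command
$p
EOF
fromfile=$(printf 'a\nb\nc\n' | sed -n -f "$script")
echo "$fromfile"
rm -f "$script"
```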
(1) a text Append the 'text' on output before reading the next input
line.
(2) b [label] Branch to the ':' command with the given 'label'. If no
'label' is given, branch to the end of the script.
(2) c text Change lines by deleting the current text buffer and at the
end of the address range, place 'text' on the output.
Start the next input cycle.
(2) d Delete the current text buffer. Start the next input cycle.
(2) D Delete the first line of the current text buffer (all
characters up to the first newline). Start the next input
cycle.
(2) g Replace the contents of the current text buffer with the
contents of the hold space.
(2) G Append the contents of the hold space to the current text
buffer.
(2) h Copy the current text buffer into the hold space.
(2) H Append a copy of the current text buffer to the hold space.
(1) i text Insert the 'text' on the standard output.
(2) l [w[file]] List current text buffer on standard output or to a file if
the w option follows. Characters that are not printable ASCII
are expanded as shown in the 'escape sequences' section below.
(2) n Copy the current text buffer to standard output. Read the
next line of input into it. The current line number
changes.
(2) N Append the next line of input to the current text buffer,
inserting an embedded newline between the two. The current
line number changes.
(2) p Copy the current text buffer to the standard output.
(2) P Copy the first line of the current text buffer (all
characters up to the first newline) to standard output.
(1) q Quit. Perform any pending outputs (a or r commands) and
terminate sed.
(1) r rfile Read the contents of 'rfile'. Place them on the output
before reading the next input line.
(2) s /regular expression/replacement/flags
Substitute the 'replacement' for instances of the 'regular
expression' in the current text buffer. Any character may
be used instead of '/'. In the 'regular expression' and in
the 'replacement' text \1 - \9 are used to indicate the nth
subexpression indicated by a '\(...\)' expression in the
'regular expression'. In the replacement text an & may be
used to indicate the entire matched expression. If the
replacement text consists only of a single '%'
character, then a copy of the replacement text for the
previous s command is used as the replacement text for this
command. 'Flags' are any of the following options, with
the following provisos: if present w must be the last one;
only the last of either p or P is used; and only the last
'n' is used.
g - Global. Substitute for all nonoverlapping instances
of the 'RE' rather than just the first one.
p - Print the current text buffer if a replacement was
made.
P - Print the first line of the current text buffer if a
replacement was made.
w[wfile] - Append the current text buffer to the file
argument as in a w command if a replacement is made.
Standard output is used if no file argument is given.
n - Where n can be 1 through 512. Perform only the nth
replacement. If g is also set or the -g option is
selected, this option means that the nth and all
succeeding substitutions should be performed.
(2) t [label] Branch to the ':' command with the given 'label' if any s
commands made any substitutions since the most recent read
of an input line or execution of a t or T. If no 'label'
is given, branch to the end of the script.
(2) T [label] Branch to the ':' command with the given 'label' if no s
commands have succeeded since the last input line or t or T
command. Branch to the end of the script if no 'label' is
given.
(2) w [wfile] Write the current text buffer to 'wfile'. If no 'wfile' is
given standard output is used.
(2) W [wfile] Write the first line of the current text buffer to 'wfile'.
If no 'wfile' is given standard output is used.
(2) x Exchange the contents of the current text buffer and hold
space.
(2) y /string1/string2/
Translate. Replace each occurrence of a character in
string1 with the corresponding character in string2. The
'/' may be any character not in 'string1' or 'string2'.
The lengths of the two strings must be equal.
(2) ! function All-but. Apply the function (or group, if function is '{')
only to lines not selected by the address(es).
(0) : label This command defines a label for b, t and T commands.
(1) = Write a line containing the current line number to the
standard output.
(2) { Execute the following commands through a matching '}' only
when the current line matches the address or address range
given.
(0) } The } command marks the end of a grouping started by a '{'.
(0) An empty command is ignored.
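A few of the functions listed above, shown in use. This is a sketch under the
assumption of a POSIX-style shell and a sed that implements these functions
as described; the variable names are invented. Features particular to this
version (the '%' replacement, T, W, the P flag) are not exercised.

```shell
# s with \(...\) subexpressions, and & for the whole match.
swap=$(printf 'hello world\n' | sed 's/\(hello\) \(world\)/\2 \1/')
amp=$(printf 'abc\n' | sed 's/b/[&]/')

# The g flag replaces every instance; a numeric flag picks the nth one.
all=$(printf 'a-a-a\n' | sed 's/a/x/g')
nth=$(printf 'a-a-a\n' | sed 's/a/x/2')

# y translates characters one for one.
upper=$(printf 'abc\n' | sed 'y/abc/ABC/')

# Hold space: G appends it, h saves the buffer; prints lines in reverse order.
rev=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

# = writes the current line number; q quits after the addressed line.
count=$(printf 'a\nb\nc\n' | sed -n '$=')
head2=$(printf 'a\nb\nc\n' | sed '2q')

# t branches back to the label while substitutions keep succeeding.
squeeze=$(printf 'a  b   c\n' | sed ':a
s/  / /
ta')

printf '%s\n' "$swap" "$amp" "$all" "$nth" "$upper" "$rev" "$count" "$head2" "$squeeze"
```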
ESCAPE SEQUENCES
The following escape sequences are used to represent unprintable characters
in 'text', 'regular expressions' and 'replacement' text. They are ignored in
'label's and 'file's. If the character following the '\' is not listed below,
the '\' causes the character to be quoted during script input. The l
command also uses this convention.
\a - bell (ASCII 07)
\b - backspace (ASCII 08)
\e - escape (ASCII 27)
\f - formfeed (ASCII 12)
\n - newline (ASCII 10)
\r - return (ASCII 13)
\t - tab (ASCII 09)
\v - vertical tab (ASCII 11)
\xhh - the ASCII character corresponding to 2 hex digits hh.
\\ - the backslash itself.
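Two of the sequences above in use, as a sketch assuming a POSIX-style shell.
Only \n and \\ are exercised here, since support for the other sequences
(\t, \e, \xhh and so on) varies between versions of sed.

```shell
# \n in an RE matches the embedded newline that N puts in the buffer.
joined=$(printf 'a\nb\n' | sed 'N;s/\n/,/')
echo "$joined"

# \\ stands for the backslash itself: turn a DOS style path into '/' form.
slashes=$(printf 'C:\\DOS\\SED\n' | sed 's/\\/\//g')
echo "$slashes"
```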
REGULAR EXPRESSIONS ('REs')
Regular expressions can be built up from the following "single-character"
'RE's:
c Any ordinary character not listed below. An ordinary character
matches itself.
\ Backslash. When followed by a special character the 'RE' matches
the "quoted character" as listed in 'Escape Sequences' above. A
backslash followed by one of <,>,(,),{,} or 0...9 represents an
'operator' in a regular expression, as described below.
. Dot. Matches any single character except the NEWLINE at the end
of a line.
^ Caret. As the leftmost character in an 'RE' this constrains the
pattern to be an anchored match. That is it must match anchored
at the first character in the line. In any other position the ^
is an ordinary character.
$ The dollar sign as the rightmost character in an 'RE' matches the
NEWLINE at the end of the line. At any other position the $ is an
ordinary character.
^RE$ This requires the 'RE' to match the entire buffer.
[c...] A nonempty string of characters enclosed by square brackets
matches any single character in the string except the NEWLINE at
the end of the string. If the first character of the string is a
caret (^), then the 'RE' matches any character not in the string
except the NEWLINE at the end of the string. A '-' sign may be
used to express ranges of characters. For example the range
'[0-9]' is equivalent to the string '[0123456789]'. The '-' is
treated as an ordinary character if it occurs in the string at a
position that can not be part of a range. This construct is
called 'set definition'.
\{m\}
\{m,\}
\{m,n\} When any of these constructs follows an ordinary character, a dot,
a 'set definition' or the '\n' construct, it matches the previous
construct for a range of occurrences. At least m occurrences will
be matched and at most n. "\{m,\}" matches at least m occurrences
and "\{m\}" matches exactly m.
* When this follows an ordinary character, a dot, a 'set
definition' or the '\n' construct, this 'RE' matches 0 or more
occurrences of that construct. This pattern is called a
'closure'.
+ This pattern is similar to the star above but matches one or more
occurrences of the previous construct.
\< The sequence \< in an 'RE' requires that the scan position in the
line must be immediately following a character that can not be
part of a "word" and immediately preceding a character that can
be part of a "word". In this context a "word" is any sequence of
upper and lowercase letters, numerals [0-9] or underscore
characters (_).
\> The sequence \> in an 'RE' requires that the scan position in the
line must be immediately following a character that can be part
of a "word" and immediately preceding a character that can not be
part of a "word".
\(...\) An 'RE' enclosed between the character sequences \( and \)
matches whatever the unadorned 'RE' matches, but saves the string
matched by the enclosed RE in a numbered substring register.
There can be up to nine such substrings in an 'RE', and the
parenthesis operators can be nested.
\n Match the contents of the nth substring register. When nested
substrings are present, 'n' is determined by counting the
occurrences of \( starting from the left.
// The empty 'RE' (//) is equivalent to the last 'RE' encountered in
the input processing.
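Some of the constructs above, tried on small inputs. This is a sketch that
assumes a POSIX-style shell; the sample data is invented. The '+', \< and \>
operators are not used because their spelling and support differ between
versions of sed.

```shell
# ^RE$ with \{m\}: lines consisting of exactly three digits.
digits=$(printf '123\n12a\n1234\n' | sed -n '/^[0-9]\{3\}$/p')

# Closure: 'b*' matches zero or more b's.
star=$(printf 'ac\nabc\nabbc\nadc\n' | sed -n '/^ab*c$/p')

# A set starting with ^ matches any character not in the set.
notset=$(printf 'cat\ncut\ncot\n' | sed -n '/c[^au]t/p')

# \1 matches whatever the first \(...\) matched.
back=$(printf 'abab\nabba\n' | sed -n '/^\(ab\)\1$/p')

# The empty RE reuses the last RE encountered (here /foo/).
empty=$(printf 'foo bar\n' | sed '/foo/s//baz/')

printf '%s\n' "$digits" "$star" "$notset" "$back" "$empty"
```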
ERROR MESSAGES
The following error messages may appear during the compilation phase of sed
processing; all cause sed to terminate:
sed: bad expression 'hh' -- The escape sequence "\x" was not followed by
two hex digits
sed: bad value for match count on s command 'command' -- A maximum value of
512 is allowed for 'n' on an s command.
sed: cannot create 'file' -- The listed output file could not be opened
sed: cannot open command-file 'file' -- The 'file' on an -f argument could
not be opened
sed: command "command" has trailing garbage -- Command was not terminated
properly
sed: duplicate label 'label' -- The indicated 'label' appeared on more than
one ':' command
sed: error processing: 'argument' -- The 'argument' is incorrect: either a
file name was missing or the g or n options had trailing garbage
sed: garbled address 'command' -- Improper 'regular expression' in an
address, line number in an address or + used in first address
sed: garbled command 'command' -- Error in the construction of the 'regular
expression' or 'replacement' in an s command, an ill-formed y command
or a 'null' character was found
sed: no addresses allowed for 'command' -- The end of group (}) and label
(:) commands can not have addresses
sed: no argument for -e -- The -e option did not have a 'script'
sed: no such command as 'command' -- The function in the 'command' was
illegal
sed: only one address allowed for 'command' -- The a i q r and = commands
allow only one address
sed: range error in set 'command' -- A [...'x'-'y'...] construct was found
where y<x
sed: RE too long: 'command' -- Internal buffer overflow while processing a
'character set'
sed: too many commands, last was 'command' -- A maximum of 200 commands is
allowed
sed: too many labels: 'command' -- A maximum of 50 labels is allowed
sed: too many line numbers 'command' -- More than 256 different line
numbers were used or more than 50 + addresses were used
sed: too many w files 'command' -- A maximum of 10 output files is allowed
sed: too many {'s 'command' -- A '{' command did not have a matching '}'
command
sed: too many }'s 'command' -- A '}' command appeared before an opening '{'
command
sed: too much text: 'command' -- The internal command text buffer
overflowed processing the command
sed: undefined label 'label' -- The listed 'label' was never defined on a :
command
sed: unknown flag 'option' -- The listed 'option' is not allowed on the
invoking line for sed
The following warning may be displayed during compilation:
sed: Label not used 'label' -- The listed 'label' was defined but never
referenced.
During the actual editing the following fatal errors can occur:
sed: append too long after line 'number' -- An A, G, H, or N command
created a line in the buffer longer than 4000 characters
sed: cannot open 'file' -- The r command could not open 'file'
sed: infinite branch loop at line 'number' -- More than 50 branches were
taken without the editing of the line completing
sed: line too long at line 'number' -- While the s command was performing a
substitution the line length exceeded 4000 characters
sed: RE bad code 'code' -- An internal processing error has occurred while
matching a 'regular expression'
sed: too many appends after line 'number' -- An append command caused more
than 20 reads and appends for the given line
sed: too many reads after line 'number' -- A read command caused more than
20 reads and appends for the given line
BUGS
I tried to fix every problem I could find, but I believe the following bugs
still exist in this version:
* The getline routine can overflow the buffer before checking for
overflow
* I still do not know exactly what the D command should do
* Strange options on the s command are allowed
* The handling of inrange for '{' commands
* Error processing could be improved
* All output files are overwritten even when there are errors
COMPATIBILITY
This version of sed is a modification of the Internet supplied GNU version.
That version was reverse-engineered from BSD 4.1 UNIX sed. The following
changes, modifications and improvements have been made:
* There is no hidden length limit (40 in BSD sed) on 'wfile' names.
* There is no limit (8 in BSD sed) on the length of 'labels'.
* The exchange command now works for long pattern and hold spaces.
* 'Escape sequences' are inhibited for both 'label's and 'filename's.
* All commands not having a 'text' argument can be separated by ";" or
can have trailing comments (#)
* a, c and i commands don't insist on a leading backslash '\n' in the
text.
* r and w commands do not insist on whitespace before the filename.
* The g, P, p and 'n' options on s commands may be given in any order.
* Escape sequences are valid in all contexts except file names and
labels.
* The full range of characters is allowed (all 256 values).
* In an 'RE', '+' calls for 1...n repeats of the previous pattern.
* The l command produces a different format than the UNIX sed.
* The W command (write first line of pattern space to file).
* The T command (branch on last substitution failed).
* sed's error messages have been made more specific and informative and
cause processing to halt.
* + allowed in the second address.
* The empty RE "//" is allowed as a first address if a previous RE has
been compiled
* The -e and -f command line options do not require their arguments to
be separate options.
* If no arguments are given sed prints usage data.
* In all contexts a blank file name means 'stdout'.
This version otherwise appears to be equivalent to the UNIX version on the
Sun4 computer. That is, I believe that anything sed did on that system
this version of sed will also do, on either a Sun4 or a PC under DOS. If
anyone can explain what sed is really supposed to do as explained in
the UNIX documentation I would appreciate the information. The manual page
refers to ed for further details, which is ambiguous at best, and I could
not understand the descriptions I read of either the D command or the
options for the s command.
I would appreciate any comments, suggestions and even bug reports. I can
sometimes be reached on INTERNET (the fiscal year is almost over), but you
can always contact me by mail or phone:
Howard Helman (helman@elm.sdd.trw.com)
Box 340
Manhattan Beach, CA 90266
213.372.5387 or after 11/1/91 310.372.538